Convergent Learning: Do different neural networks learn the same representations?
نویسندگان
چکیده
Recent successes in training large, deep neural networks (DNNs) have prompted active investigation into the underlying representations learned on their intermediate layers. Such research is difficult because it requires making sense of non-linear computations performed by millions of learned parameters. However, despite the difficulty, such research is valuable because it increases our ability to understand current models and training algorithms and thus create improved versions of them. We argue for the value of investigating whether neural networks exhibit what we call convergent learning, which is when separately trained DNNs learn features that converge to span similar spaces. We further begin research into this question by introducing two techniques to approximately align neurons from two networks: a bipartite matching approach that makes one-to-one assignments between neurons and a spectral clustering approach that finds many-to-many mappings. Our initial approach to answering this question reveals many interesting, previously unknown properties of neural networks, and we argue that future research into the question of convergent learning will yield many more. The insights described here include (1) that some features are learned reliably in multiple networks, yet other features are not consistently learned; and (2) that units learn to span low-dimensional subspaces and, while these subspaces are common to multiple networks, the specific basis vectors learned are not; (3) that the average activation values of neurons vary considerably within a network, yet the mean activation values across different networks converge to an almost identical distribution.
منابع مشابه
EMG-based wrist gesture recognition using a convolutional neural network
Background: Deep learning has revolutionized artificial intelligence and has transformed many fields. It allows processing high-dimensional data (such as signals or images) without the need for feature engineering. The aim of this research is to develop a deep learning-based system to decode motor intent from electromyogram (EMG) signals. Methods: A myoelectric system based on convolutional ne...
متن کاملHandwritten Character Recognition using Modified Gradient Descent Technique of Neural Networks and Representation of Conjugate Descent for Training Patterns
The purpose of this study is to analyze the performance of Back propagation algorithm with changing training patterns and the second momentum term in feed forward neural networks. This analysis is conducted on 250 different words of three small letters from the English alphabet. These words are presented to two vertical segmentation programs which are designed in MATLAB and based on portions (1...
متن کاملمعرفی شبکه های عصبی پیمانه ای عمیق با ساختار فضایی-زمانی دوگانه جهت بهبود بازشناسی گفتار پیوسته فارسی
In this article, growable deep modular neural networks for continuous speech recognition are introduced. These networks can be grown to implement the spatio-temporal information of the frame sequences at their input layer as well as their labels at the output layer at the same time. The trained neural network with such double spatio-temporal association structure can learn the phonetic sequence...
متن کاملLearning and Evolution in Neural Networks
The paper describes simulations on populations of neural networks that both evolve at the population level and learn at the individual level. Unlike other simulations, the evolutionary task (finding food in the environment) and the learning task (predicting the next position of food on the basis of present position and planned network's movement) are different tasks. In these conditions both le...
متن کاملk-Shot Learning for Action Recognition
In the problem of k-shot learning, a model must learn to reliably classify an example having seen only k previous instances of examples of the same class. With recent success in using memory in neural networks to perform kshot learning, we propose a technique that uses MemoryAugmented Neural Networks to perform k-shot learning for action recognition in videos. We believe the use of memory will ...
متن کامل